ABSTRACT

Although the study of teacher effectiveness is confronted by a
number of problems that are generally associated with the conduct of
behavioral research, it is possible in some instances to resolve or
circumvent some of the current methodological stumbling blocks that
tend to reduce the credibility of research findings. This paper
discusses three methodological problems: (1) the importance of the 
teacher relative to his ability to affect student growth; (2) the 
attempts to operationalize constructs that appear to be related to 
student outcomes; and (3) the statistical problems associated with 
measuring student growth. Several alternative solutions to these 
problem areas are presented. (JMF) 

Researchers who study how teachers affect student behavior are
confronted with the great majority of problems that are generally
associated with the conduct of behavioral research. A perusal of
the methodological article by Berliner, contained within this issue,
provides one with an appreciation of the many impediments that are
inherent in teacher effects research. Although, as the Berliner
paper reveals, the problems are many and serious, it is possible in
some instances to resolve or circumvent some of the current
methodological stumbling blocks that tend to reduce the credibility of
research findings and discourage many able educators from conducting
research in this area.

The purpose of this note is to address three methodological
problems that were frequently discussed, both formally and informally,
at the National Invitational Conference on Research on Teacher Effects.
The three problem areas briefly discussed are somewhat representative
of the wide range of existing impediments. The importance of the
teacher relative to his or her ability to affect student growth
constitutes the first problem. The second is that of attempting to
operationalize constructs that appear to be related to student outcomes.
The third and final topic concerns a statistical problem associated
with measuring student growth.

This note is primarily addressed to two types of educators:
(a) those who are contemplating doing teacher-effects research and
(b) those who are presently doing research in this most important
area. For incipient researchers, our goal is simply to acquaint
them with three of the many methodological problems that they
will shortly confront. To those able researchers currently attempting
to link teacher variables to student outcomes, we hope to be able to
propose an idea or two that might assist them in their important work.

The Relative Importance of Teaching Variables 

Perhaps the most fundamental problem relative to the conduct,
interpretation, and appreciation of research into teacher effects is
the fact that the boundaries of this field of inquiry have not been
clearly established. Unfortunately, at present, we cannot answer
with confidence the following question: What influence can a teacher
per se exert on a child's learning and development? It is obvious
from the studies contained in this issue, and elsewhere (1), that
teachers do influence the quality and quantity of student learning.
But how important are teacher variables in comparison to other known
correlates of student achievement such as socio-economic class, ability
level, etc.? Empirically based answers to this question are needed
if we are to establish realistic expectations for the potential results
of future studies on teacher effects and if we are to convince public
audiences and policy makers of the need to support future research
efforts in this area.

Parenthetically, the need to determine the relative contribution
of teacher variables to pupil achievement is particularly
important with respect to the public audiences. As Berliner (2) has
pointed out, inferences drawn from the well publicized research of
Coleman (3) and Jencks (4) have promoted the misleading impression that
teacher and school variables contribute little to the academic, and
even economic, attainment of students. Further, the impression has been
created that the greatest ultimate educational payoff will result from
working with social and attitudinal factors at the expense of school
and teacher variables. Fortunately, there exists today a growing
awareness that these impressions have been overdrawn. Not only can
the studies upon which these inferences are based be challenged on
both theoretical and methodological grounds, but there exists an
increasing body of research which both advances contradictory findings
and demonstrates the promise of additional research into the nature
of school and teacher variables.

Recognizing the importance of this issue, educational researchers have
recently attempted to assess the relative contributions of
teacher variables to pupil achievement. McDonald (5), for example,
has hypothesized that teachers may account for as much as 25 percent of
the variance in reading achievement scores at the elementary level.
McDonald admits, however, that the estimates used in his argument
were crude. But there is a more fundamental limitation to current
attempts to identify the relative contribution of teaching to
achievement score variance. Briefly, efforts such as McDonald's concentrate
on attempting to explain the 20 to 40 percent of achievement score
variance which is not accounted for by the relationship between
beginning-of-year and end-of-year performance. Not only are teacher
"main effects" not represented in this pool of residual variance, but
there are simply too many methodological problems inherent in the
regression-type analysis (e.g., multicollinearity) to be confident
that the residual variance reflects differential teacher effects.

We believe that there is a more appropriate methodology to assess
the contribution of teaching to achievement variance. It is a
methodology employed in fields such as agriculture and animal genetics
to pursue similar goals. In animal genetics, for example, when a
researcher is interested in determining, over a generation, the
sources of variation in a trait in cows, he or she employs
a hierarchical or nested design in which the lead factor
is comprised of a representative sample of bulls. Several cows are
then nested within each level of the lead variable and the resultant
progeny of the inevitable liaisons are measured for the trait under
study. Specifically, through the estimation of variance components,
the relative genetic contributions made by both bulls and cows are
established.

Similar types of educational studies can be readily designed to
specifically estimate the relative contribution to variance of such
factors as schools, teachers, and classrooms. The first step of
such a study might be to assemble a representative sample of schools
so that the influence of school and community context variables can
be estimated. Since teachers are naturally nested within schools and
classes are usually nested within teachers, these latter two random
variables can also be built into the design in a hierarchical fashion.
To avoid problems associated with change scores, pretest and posttest
achievement scores could be treated as a repeated measurement variable
and crossed with all levels of the school and classroom variable.
Finally, it would be interesting to entertain one or more important
context variables such as social class or ability level of students.
Such variables could be built into the design as a within-class variable.
Student test scores would then be subjected to ANOVA procedures. By
computing variance component estimates for the proposed study,
it would be possible to assess the proportion of achievement score
variance attributable to: (a) school-community factors, (b) teacher
effects, and (c) classrooms. Using techniques appropriate for fixed
variables (e.g., eta squared or omega squared coefficients), the relative
contributions of context variables could also be estimated. In
conclusion, by conducting a few studies along the lines proposed above,
it would be only a matter of time before one of our most troublesome
methodological problems and public issues would be resolved.
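
The variance component estimates described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions that are ours and not part of the proposed study: the design is balanced, classes are collapsed into teachers, the pretest-posttest factor is omitted, and the simulated effect sizes (standard deviations of 3, 5, and 10 score points) are hypothetical. The function names are likewise illustrative.

```python
import random
from statistics import mean


def variance_components(scores):
    """Estimate variance components for a balanced nested design.

    scores[s][t] is the list of student scores for teacher t in school s.
    Every school must have the same number of teachers, and every teacher
    the same number of students.
    """
    n_schools = len(scores)
    n_teachers = len(scores[0])
    n_students = len(scores[0][0])

    teacher_means = [[mean(cls) for cls in school] for school in scores]
    school_means = [mean(tm) for tm in teacher_means]
    grand_mean = mean(school_means)

    # Sums of squares for each level of the hierarchy.
    ss_school = n_teachers * n_students * sum(
        (m - grand_mean) ** 2 for m in school_means)
    ss_teacher = n_students * sum(
        (teacher_means[s][t] - school_means[s]) ** 2
        for s in range(n_schools) for t in range(n_teachers))
    ss_within = sum(
        (y - teacher_means[s][t]) ** 2
        for s in range(n_schools) for t in range(n_teachers)
        for y in scores[s][t])

    ms_school = ss_school / (n_schools - 1)
    ms_teacher = ss_teacher / (n_schools * (n_teachers - 1))
    ms_within = ss_within / (n_schools * n_teachers * (n_students - 1))

    # Solve the expected-mean-square equations; negative estimates become 0.
    return {
        "school": max(0.0, (ms_school - ms_teacher) / (n_teachers * n_students)),
        "teacher": max(0.0, (ms_teacher - ms_within) / n_students),
        "student": ms_within,
    }


def simulate_scores(n_schools, n_teachers, n_students, seed=0):
    """Generate hypothetical scores with school, teacher, and student effects."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_schools):
        school_effect = rng.gauss(0, 3)
        school = []
        for _ in range(n_teachers):
            teacher_effect = rng.gauss(0, 5)
            school.append([50 + school_effect + teacher_effect + rng.gauss(0, 10)
                           for _ in range(n_students)])
        data.append(school)
    return data


if __name__ == "__main__":
    components = variance_components(simulate_scores(30, 4, 25))
    total = sum(components.values())
    for source, value in components.items():
        print(f"{source:8s} {value:8.2f} {value / total:6.1%}")
```

Dividing each component by their sum gives the proportion of score variance attributable to each source, which is the quantity of interest in the proposed study.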

The Problem of Construct Definition 

In one of the most comprehensive reviews of teacher effects
research, Rosenshine and Furst (6) synthesized the results of approximately
fifty studies, most of which correlated
teacher process variables with student achievement gain. The synthesis
produced eleven categories of teacher behaviors that were apparently
related to student achievement. The categories were further broken
down into a principal set of five behaviors construed to have strong
research support, and a secondary set of six behaviors judged to have
weaker support. Members of the principal set in decreasing order of
apparent strength are as follows: clarity, variability, enthusiasm,
task-oriented and/or businesslike behaviors, and student opportunity to
learn criterion material. The six weaker but promising behaviors are:
use of student ideas and general indirectness, criticism, use of
structuring comments, types of questions, probing, and level of difficulty
of instruction.

Although the fifty reviewed studies are subject to criticisms on 
methodological grounds, in toto they do represent the most solid body 
of evidence for consistently demonstrating that teacher behavior is 
related to measures of student achievement. Unfortunately, the behavioral 
complexes supported are just that: complex. Thus, a body of our most
promising research is plagued by problems of definition and
operationalism.

As a case in point, consider the teacher-behavior construct with most
research support, clarity. Suppose teacher clarity is defined as "being
clear and easy to understand." Obviously, such a definition is circular.
Yet this is an example of the most common kind of clarity definition to
be found in clarity research. A construct defined in this manner cannot
be readily observed or measured. In fact, an observer must infer its
existence. From a measurement perspective, an observer is required to
make a rating rather than a simple record of occurrence. Since behaviors
that demand rating procedures (termed high-inference behaviors) are by
nature ambiguous, their use in research sets the stage for evaluating the
findings of such studies with suspicion. One potentially profitable method
for escaping the inherent problems of using high-inference variables is
to identify their low-inference constituents, i.e., behaviors which are
amenable to direct observation and tallying. To the extent to which
low-inference constituents can be determined, the potential for conducting
research which will yield more definitive results is increased.

Using the clarity construct as a working example, we propose the 
following blueprint for so reducing this high-inference construct. 
First, a tentative mapping of the domain, in low-inference terms, is 
necessary. One way of getting such a mapping might be to ask a large 
number of students to think of their most "clear" teacher and list some 
specific behaviors that make that particular teacher "clear." Similarly,
the same operation can be carried out for the most "unclear" teacher. 
Subsequent to obtaining these behaviors, experienced educators can 
analyze and categorize the results into sets containing well-defined, 
easily observable behaviors. 

Next, the tentative mapping can be put to empirical test by first 
asking large samples of students to think of their most "clear" teacher
and to relate how often that teacher exhibits each of the behaviors. 
Once similar observations are obtained from students who are instructed 
to think of their most "unclear" teacher, the two sets of data can be 
aggregated and subjected to discriminant function analysis. This mul- 
tivariate technique can be used to discover if the tentative mapping 
distinguishes significantly between teachers perceived to be "clear" 
and those perceived to be "unclear." If so, then those behaviors 
that contribute heavily to the discrimination can be regarded as a 
set of low-inference behaviors that at least map one portion of the
clarity construct. 
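
As a rough sketch of the discriminant step, the following Python computes two-group linear discriminant weights for just two behaviors. Everything specific here is an assumption made for illustration: the two behaviors (say, "gives examples" and "uses the chalkboard"), the frequency reports, and the function name are hypothetical, and a full analysis would cover many behaviors and include a significance test, which this sketch omits.

```python
from statistics import mean


def discriminant_weights(clear, unclear):
    """Two-group linear discriminant weights for two behaviors.

    Each argument is a list of (behavior_1, behavior_2) frequency reports.
    Returns w = S^-1 (mean_clear - mean_unclear), where S is the pooled
    within-group covariance matrix.
    """
    def centroid(group):
        return (mean([r[0] for r in group]), mean([r[1] for r in group]))

    def scatter(group, center):
        sxx = sum((r[0] - center[0]) ** 2 for r in group)
        syy = sum((r[1] - center[1]) ** 2 for r in group)
        sxy = sum((r[0] - center[0]) * (r[1] - center[1]) for r in group)
        return sxx, sxy, syy

    c_clear, c_unclear = centroid(clear), centroid(unclear)
    s1 = scatter(clear, c_clear)
    s2 = scatter(unclear, c_unclear)
    df = len(clear) + len(unclear) - 2

    # Pooled within-group covariance matrix [[a, b], [b, c]].
    a = (s1[0] + s2[0]) / df
    b = (s1[1] + s2[1]) / df
    c = (s1[2] + s2[2]) / df
    det = a * c - b * b
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")

    d0 = c_clear[0] - c_unclear[0]
    d1 = c_clear[1] - c_unclear[1]
    # Multiply the inverse of the 2x2 matrix by the centroid difference.
    return ((c * d0 - b * d1) / det, (a * d1 - b * d0) / det)


if __name__ == "__main__":
    # Hypothetical reports: (gives examples, uses the chalkboard).
    clear_reports = [(4, 3), (5, 3), (4, 4), (5, 4)]
    unclear_reports = [(1, 3), (2, 3), (1, 4), (2, 4)]
    print(discriminant_weights(clear_reports, unclear_reports))
```

In this contrived data set the first behavior separates the two groups and the second does not, so only the first receives a nonzero weight. Behaviors with large weights are the candidates for the low-inference mapping.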

Such an approach should be heavily replicated for at least two
reasons. First, apparent significance may always be the result of 
chance alone; consequently, replications with similar results are 
needed to strengthen the conjecture that such findings are not chance 
artifacts. Secondly, unknown biases may be in operation when students 
suggest behaviors or when educators refine them into workable form. It
is possible that such unknown factors may serve to restrict the scope
of responses, thus preventing comprehensive mapping of the high-inference
construct. Again, replication helps by increasing the potential for
broad coverage of the construct domain. 

In conclusion, it is difficult to argue with the spirit of Rosenshine's 
recent contention that the greatest current need is to conduct more 
research which is designed to link teacher variables with student
outcomes. However, studies which attempt to further explore the relationships
between student growth and the eleven or so correlates advanced by 
Rosenshine will be greatly hampered, and their value possibly reduced, 
until serious attention is given to the problem of defining these ab- 
stract constructs in terms of low-inference behaviors. Parenthetically, 
studies whose purpose is to identify the specific components of several 
high-inference constructs are currently being conducted at The Ohio 
State University. 

The Change Score Dilemma 

The emerging paradigm for teacher effects research consists of 
relating promising teacher presage or process measures to measured 
changes in pupil learning. The objective of these studies is to 
identify teacher variables which correlate meaningfully with student 
change or, if the study is an experiment, to identify treatment
conditions which are responsible for maximal gain. Measures of change
or gain are sometimes calculated by simply subtracting pretest 
scores from posttest scores. However, due to the growing awareness 
that raw change scores are susceptible to regression effects (7), 
more often researchers attempt to "adjust" raw scores for regression
toward the mean by partialing out differential pretest performance.
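
The regression effect in raw change scores, and what partialing out the pretest does to it, can be seen in a small simulation. The sketch below rests on assumptions of our own choosing: every pupil truly gains the same 5 points, true ability has a standard deviation of 10, and both tests carry independent measurement error with a standard deviation of 5. The function names are illustrative only.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n=2000, seed=1):
    """Return (r of pretest with raw gain, r of pretest with residual gain)."""
    rng = random.Random(seed)
    ability = [rng.gauss(50, 10) for _ in range(n)]
    pre = [t + rng.gauss(0, 5) for t in ability]
    post = [t + 5 + rng.gauss(0, 5) for t in ability]  # uniform true gain of 5

    raw_gain = [b - a for a, b in zip(pre, post)]

    # Residualized gain: the part of the posttest not predicted by the pretest.
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    slope = (sum((a - mean_pre) * (b - mean_post) for a, b in zip(pre, post))
             / sum((a - mean_pre) ** 2 for a in pre))
    residual_gain = [(b - mean_post) - slope * (a - mean_pre)
                     for a, b in zip(pre, post)]

    return pearson(pre, raw_gain), pearson(pre, residual_gain)


if __name__ == "__main__":
    r_raw, r_residual = simulate()
    print(f"pretest vs raw gain:      {r_raw:+.3f}")
    print(f"pretest vs residual gain: {r_residual:+.3f}")
```

Although no pupil gains more than any other, raw gains correlate negatively with observed pretest scores purely because of measurement error. The residualized gain is uncorrelated with the observed pretest by construction.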

Unfortunately, it is known that even adjusted or residualized
gain scores, despite their intuitive appeal, are not suitable measures
of change (8). A major problem, as mathematically demonstrated by
Bereiter (9), is that change scores based on residuals "over-correct."
Specifically, to the extent to which error of measurement is reflected
in pretest scores, residualized gain scores will be spuriously large
for low-pretest performers and spuriously small for students who earn
high pretest scores. Consequently, if the research is descriptive and
calls for computing correlations between teacher variables and
residualized gain scores, to the extent to which teacher variables covary with
pretest performance, the resultant correlations will be spurious.

By way of simple illustration, consider a hypothetical situation in
which it is desired to estimate the correlation between teacher age and
mean student gain in reading over a school year. Suppose reasonable
samples of classrooms are studied where, for each, the age of the
teacher and the residualized gain in reading for the year are obtained.
Now assume that it turns out that the youngest teachers in the sample
tend to be located in inner-city schools and the oldest teachers in
outer-city schools. Suppose further that outer-city pretest scores are
higher on the average. In this case, the computed correlation
coefficient between age and gain would be biased in a negative
direction, suggesting falsely that students of younger teachers experience
greater achievement gains. In sum, those who study teacher effects are
confronted with a rather serious methodological problem, one
which is particularly serious because of the modest and fragile
nature of the correlations that are usually obtained.

In discussing ways in which the change score problem might be minimized,
it is important to separate experimental and correlational research.
Measuring change is more tractable within an experimental context;
in fact, an experimenter is presented with several alternative methods
which can completely circumvent the use of change scores. An approach
which is most justifiable when pupils have been randomly assigned
to respective treatment conditions is to perform an analysis of
covariance on posttest scores using pretest scores as the covariable.
Essentially, this was the strategy employed by Gage in the experiment
reported in this issue. Even though the over-correction phenomenon
mentioned earlier is still reflected in adjusted posttest scores, the
random distribution of adjusted scores among treatment conditions
selectively controls for pretreatment inequalities. Analysis of
covariance should be used cautiously, however, and only by data
analysts who are familiar with its many subtle limitations.
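
The adjustment at the heart of analysis of covariance can be illustrated with a short Python sketch. It computes only the pooled within-group regression slope and the covariate-adjusted posttest means; the F test and the checks on assumptions (such as homogeneity of slopes) are left out. The data, the treatment labels, and the function name are hypothetical.

```python
def ancova_adjusted_means(groups):
    """Covariate-adjusted posttest means for a one-way ANCOVA.

    groups maps a treatment label to a list of (pretest, posttest) pairs.
    Returns the pooled within-group slope and a dict of adjusted means.
    """
    all_pre = [pre for pairs in groups.values() for pre, _ in pairs]
    grand_pre = sum(all_pre) / len(all_pre)

    sxx_within = 0.0
    sxy_within = 0.0
    means = {}
    for label, pairs in groups.items():
        mean_pre = sum(p for p, _ in pairs) / len(pairs)
        mean_post = sum(q for _, q in pairs) / len(pairs)
        means[label] = (mean_pre, mean_post)
        sxx_within += sum((p - mean_pre) ** 2 for p, _ in pairs)
        sxy_within += sum((p - mean_pre) * (q - mean_post) for p, q in pairs)

    # Pooled within-group regression of posttest on pretest.
    slope = sxy_within / sxx_within

    # Shift each posttest mean to what it would be at the grand pretest mean.
    adjusted = {label: mean_post - slope * (mean_pre - grand_pre)
                for label, (mean_pre, mean_post) in means.items()}
    return slope, adjusted


if __name__ == "__main__":
    # Hypothetical (pretest, posttest) scores for two treatment conditions.
    data = {
        "treatment": [(12, 20), (15, 24), (18, 27), (21, 31)],
        "control": [(11, 16), (16, 21), (17, 23), (22, 27)],
    }
    slope, adjusted = ancova_adjusted_means(data)
    print(f"pooled slope: {slope:.3f}")
    for label, value in adjusted.items():
        print(f"{label:10s} adjusted posttest mean: {value:.2f}")
```

The treatment comparison is then made on the adjusted means rather than on change scores.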

A more direct alternative with less demanding statistical
assumptions is to create a blocking variable from pretest scores and build
this variable into the design of the experiment. In the simplest case,
where there is a treatment variable crossed with the pretest variable,
a standard two-factor ANOVA is performed on the posttest scores. If
the subjects have been randomly assigned to treatment conditions, not 
only is the analysis capable of documenting significant gain, but it 
is also capable of detecting interactions between levels of pretest 
performance and treatments. Feldt (10) has discussed several advantages 
of this design in comparison to using analysis of covariance. 

A third experimental option is to treat pretest and posttest scores
as a single factor and to build this factor into the design as a repeated
measurements variable. In the simplest case, a treatment variable is
crossed with a repeated measurements variable that
consists of the pretest and posttest scores. If there should be greater
gain under some treatment conditions, it will be detected by the presence
of a significant treatment by pre-post testing interaction in the ANOVA.
If it has not been possible to initially equate treatment groups, this
option is particularly attractive because the means to detect
pretreatment biases are readily available.

It is clear that for experimental work, there are ample alternatives
to the use of change scores. Hence, as Cronbach and Furby (8) have
concluded, "There appears to be no need to use measures of change as
dependent variables and no virtue in using them."

Unfortunately, overcoming problems associated with change scores
is not as easy when the research is of the correlational type. There
exists, however, a method which has been shown by Lord (11) to be
superior to computing correlations which involve residual gain scores.
The method consists of: (a) completely correcting zero-order
correlations for the unreliable variance in each measure (i.e., correcting for
attenuation) and (b) using these corrected correlations to compute
semi-partial correlation coefficients where pretest performance has
been partialed out of posttest performance.

To illustrate the Lord method, consider again the relationship
between teacher's age and gain in reading during the school year. One
first obtains reasonable estimates of the reliability of age, pretest,
and posttest variables. Using the reliability information within the
context of standard formulae (12: p. , 55 ), the semi-partial correlation
between teacher's age and posttest scores is calculated. The resultant
semi-partial correlation represents the correlation between age and
reading achievement subsequent to removing initial reading ability
from the posttest reading measure and further possesses the advantage
that it is least vulnerable to the "over-correction" problem mentioned
earlier. Granted, greater labor is expended in using this approach,
but considering the importance of the relations being sought, this
methodology should be used far more extensively in descriptive studies
of teacher effects.
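
The two steps of this procedure can be sketched in Python as follows. The sketch uses the familiar textbook formulas for correction for attenuation and for a semi-partial correlation; whether these match the exact formulae given in reference (12) is an assumption on our part, since those formulae are not reproduced here. All correlations and reliabilities in the example are hypothetical, as are the function names.

```python
from math import sqrt


def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures."""
    return r_xy / sqrt(rel_x * rel_y)


def semipartial(r_age_post, r_age_pre, r_pre_post):
    """Correlation of age with posttest after pretest is removed from posttest."""
    return (r_age_post - r_age_pre * r_pre_post) / sqrt(1 - r_pre_post ** 2)


def corrected_semipartial(observed, reliability):
    """Apply step (a), then step (b), to a set of observed correlations.

    observed: dict with keys "age_pre", "age_post", "pre_post".
    reliability: dict with keys "age", "pre", "post".
    """
    r_age_pre = disattenuate(observed["age_pre"],
                             reliability["age"], reliability["pre"])
    r_age_post = disattenuate(observed["age_post"],
                              reliability["age"], reliability["post"])
    r_pre_post = disattenuate(observed["pre_post"],
                              reliability["pre"], reliability["post"])
    return semipartial(r_age_post, r_age_pre, r_pre_post)


if __name__ == "__main__":
    # Hypothetical observed correlations and reliability estimates.
    observed = {"age_pre": 0.18, "age_post": 0.27, "pre_post": 0.648}
    reliability = {"age": 1.0, "pre": 0.81, "post": 0.81}
    print(f"{corrected_semipartial(observed, reliability):.3f}")
```

Note that with low reliabilities a corrected correlation can exceed 1.0, in which case the semi-partial formula breaks down; reasonable reliability estimates are therefore essential.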

Concluding Remarks 

In this brief note we have only been able to mention three out of 
the vast array of impediments associated with scientific inquiry into 
the nature of teacher effects. Admittedly, we selected these three 
because, in our view, potential remedies lie close at hand. If nothing
else, our purpose has been to show that some of the methodological 
obstacles confronting educational research can be overcome and to 
demonstrate that methodologically respectable teacher-effects research 
can be conducted. We hope that this article will encourage others 
to respond to this crucial need and challenge. 